Cross-Language Learning for Program Classification using BTBCNNs
Authors
Abstract
Towards the vision of automatically translating code that implements an algorithm from one programming language into another, this paper proposes an approach for automated program classification using bilateral tree-based convolutional neural networks (BiTBCNNs). A BiTBCNN is layered on top of two tree-based convolutional neural networks (TBCNNs), each of which recognizes the algorithm of code written in one programming language. The combination layer of the networks recognizes the similarities and differences among code in different programming languages. The BiTBCNNs are trained on source code that is written in different languages but known to implement the same algorithms and/or functionalities. For a preliminary evaluation, we use 3591 Java and 3534 C++ code snippets from 6 algorithms, which we crawled systematically from GitHub. We obtained over 90% accuracy in the cross-language binary classification task, which is to tell whether any two given code snippets implement the same algorithm. For the actual algorithm classification task, i.e., predicting the algorithm label of an arbitrary C++ code snippet, we achieve a precision of 80.5%. The capability of BiTBCNNs to classify programs into algorithms across programming languages may therefore become a useful building block towards automated program translation.
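The bilateral structure described in the abstract can be illustrated with a toy sketch: two per-language tree encoders feed a combination layer that outputs the probability that two snippets implement the same algorithm. Everything here is an assumption made for illustration and not the authors' implementation: the names `TreeEncoder` and `BilateralClassifier`, the tuple-based tree format, the simplified convolution (a parent vector plus the mean of its direct children), the single logistic combination layer, and the random untrained weights.

```python
import math
import random

# A tree node is (feature_vector, [children]). The feature vectors are
# hypothetical stand-ins for learned AST node embeddings.

def make_matrix(rows, cols, rng):
    """Random weight matrix; a real model would learn these values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

class TreeEncoder:
    """Simplified tree-based convolution followed by max pooling over nodes."""

    def __init__(self, in_dim, out_dim, rng):
        self.w_parent = make_matrix(out_dim, in_dim, rng)
        self.w_child = make_matrix(out_dim, in_dim, rng)

    def _convolve(self, node, outputs):
        features, children = node
        total = matvec(self.w_parent, features)
        # Add the mean contribution of the direct children (none for a leaf).
        for child_features, _ in children:
            contribution = matvec(self.w_child, child_features)
            total = [t + c / len(children) for t, c in zip(total, contribution)]
        outputs.append([math.tanh(t) for t in total])
        for child in children:
            self._convolve(child, outputs)

    def encode(self, tree):
        outputs = []
        self._convolve(tree, outputs)
        # Max pooling over all nodes gives a fixed-size vector per tree.
        return [max(column) for column in zip(*outputs)]

class BilateralClassifier:
    """Two encoders (one per language) joined by a logistic combination layer."""

    def __init__(self, in_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.left = TreeEncoder(in_dim, hidden_dim, rng)
        self.right = TreeEncoder(in_dim, hidden_dim, rng)
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(2 * hidden_dim)]
        self.bias = 0.0

    def same_algorithm_probability(self, left_tree, right_tree):
        # Concatenate both tree encodings, then apply a sigmoid output unit.
        joint = self.left.encode(left_tree) + self.right.encode(right_tree)
        logit = sum(w * x for w, x in zip(self.w_out, joint)) + self.bias
        return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical toy trees standing in for a Java AST and a C++ AST.
java_tree = ([1.0, 0.0, 0.0], [([0.0, 1.0, 0.0], []), ([0.0, 0.0, 1.0], [])])
cpp_tree = ([1.0, 0.0, 0.0], [([0.0, 1.0, 0.0], [])])

model = BilateralClassifier(in_dim=3, hidden_dim=4)
p = model.same_algorithm_probability(java_tree, cpp_tree)
print(p)
```

Because the weights are untrained, the printed probability carries no meaning; the sketch only shows how data flows from two language-specific encoders into one shared decision.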
Similar resources
A Cross-sectional Study of Oral Communication Strategies by Successful EFL Learners
This paper reports on how top EFL students foster oral communication strategies (OCSs) throughout their 4-year English program at the university level. It is a cross-sectional study of 40 EFL learners enrolled in the Department of English, Faculty of Education, Taiz University, Yemen. Data were collected through a questionnaire based on Oxford’s Strategy Inventory for Language Learning (SILL). ...
The Influence of Data-Driven Exercises Through Using a Computer Program on Vocabulary Improvement in an EFL Context
The present study was conducted to evaluate data driven learning (DDL) combined with Computer Assisted Language Learning (CALL) as an approach to improving vocabulary knowledge of Iranian postgraduates majoring in teaching English, English literature and translation. The purpose was to help language learners get familiar with DDL as a student-centered method taking advantage of a computer progr...
Semi-Supervised Matrix Completion for Cross-Lingual Text Classification
Cross-lingual text classification is the task of assigning labels to observed documents in a label-scarce target language domain by using a prediction model trained with labeled documents from a label-rich source language domain. Cross-lingual text classification is popularly studied in natural language processing area to reduce the expensive manual annotation effort required in the target lang...
Journal title:
Volume, issue:
Pages: -
Publication date: 2017